Cross-calibration of cluster mass-observables 
This paper is a first step towards developing a formalism to optimally extract dark energy information from number counts using multiple cluster observation techniques. We use a Fisher matrix analysis to study the improvements in the joint dark energy and cluster mass-observable constraints resulting from combining cluster counts and clustering abundances measured with different techniques. We use our formalism to forecast the constraints on $\Omega_{\rm DE}$ and $w$ from combining optical and SZ cluster counting on a 4000 sq. degree patch of sky. We find that this cross-calibration approach yields ~2 times better constraints on $\Omega_{\rm DE}$ and $w$ compared to simply adding the Fisher matrices of the individually self-calibrated counts. The cross-calibrated constraints are less sensitive to variations in the mass threshold or maximum redshift range. A by-product of our technique is that the correlation between different mass-observables is well constrained without the need for additional priors on its value.



I. INTRODUCTION 

The evolution of the number of clusters of galaxies pro- 
vides a powerful tool to study the nature of dark energy. 
Clusters are sensitive probes of the growth of structure 
because cluster abundances are exponentially dependent 
on the linear density perturbation field. In addition, clus- 
ter surveys are sensitive to the evolution of the volume 
element with redshift so that cluster surveys also probe 
the background cosmology. 

Planned and ongoing cluster surveys will detect millions of clusters using a variety of techniques such as counts of optically detected galaxies (e.g. DES [1], LSST [?]), the Sunyaev-Zel'dovich (SZ) flux decrement (e.g. SPT [?] and ACT [?]), X-ray temperature and surface brightness (e.g. eRosita), and weak lensing shear. Because different cluster techniques suffer from different sources of errors, combining the information from different surveys is essential to reduce random errors and control the systematics.

One of the major challenges in extracting dark energy 
information from clusters is that cluster masses are not 
directly observable. One must rely on observable proxies 
for mass which only correlate statistically with the true 
mass. The inherent uncertainties in the observable-mass 
relation will degrade cosmological constraints if not well 
understood. Methods have been developed to use additional cluster properties such as the cluster power spectrum [?], sample covariance from counts in cells [?], or the shape of the observed mass function [?] to "self-calibrate" the mass-observable relation by simultaneously solving for the cosmological and mass-observable parameters.

Other works have investigated combining different cluster techniques to cross-calibrate the mass-observable relations of each [?, 12, 13]. In [?] and [12], the cross-calibration is between an SZ or X-ray survey and a detailed mass follow-up to calibrate the mass-observable relation, whereas [13] combines SZ and X-ray surveys. However, these studies have assumed that the two surveys were independent, so that the joint constraints were estimated by adding the Fisher matrices of both experiments. But if two surveys observe the same patch of sky, the measurements are not independent. The goal of this paper is to show how to exploit the interdependence of cluster surveys over the same patch of sky to improve constraints on dark energy and mass-nuisance parameters.

The paper is organized as follows. In §II we describe the Fisher matrix formalism to forecast cosmological constraints from cluster counts and clustering using a single and multiple observables. We describe the major cluster mass determination techniques in §III and explain our parametrization of the errors in the observables, i.e. the mass-observable distributions. Results are presented in §IV and our conclusions and prospects for future work are given in §V.



II. SELF-CALIBRATION AND THE FISHER 
MATRIX FORMALISM 

In this section we review how to obtain cosmological 
constraints from cluster counts and clustering using a 
single or multiple observables. Combining counts and 
clustering to derive cosmological constraints from a single 
mass estimation technique is often referred to as self- 
calibration. 



A. Mean number counts

The use of clusters of galaxies as cosmological indicators depends on how reliably N-body simulations can predict the number density of dark matter halos associated with clusters of a given mass given an initial power spectrum. We adopt the fitting function of [3] for the differential comoving number density of clusters

$$\frac{dn}{d\ln M} = 0.3\,\frac{\rho_m}{M}\,\frac{d\ln\sigma^{-1}}{d\ln M}\,\exp\left[-\left|\ln\sigma^{-1} + 0.64\right|^{3.82}\right], \tag{1}$$

where $\sigma^2(M,z)$ is the variance of the density field in a spherical region with mean (present-day) matter density $\rho_m$ encircling a mass $M$. Even though more recent fitting functions exist (e.g. [?]), we adopt the above for easier comparison with the literature (e.g. [?]) and because the results are relatively insensitive to the fiducial mass function used.
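As an illustration only, the fitting function can be sketched in a few lines. The coefficients (0.3, 0.64, 3.82) are taken as read from Eq. (1); the power-law $\sigma(M)$ and the names `multiplicity`, `dn_dlnm`, `sigma_star`, `m_star` and `gamma` are hypothetical stand-ins, since a real calculation would obtain $\sigma(M,z)$ from the linear power spectrum.

```python
import math


def multiplicity(sigma: float) -> float:
    """Fitting-function factor 0.3 * exp(-|ln(1/sigma) + 0.64|**3.82)."""
    return 0.3 * math.exp(-abs(math.log(1.0 / sigma) + 0.64) ** 3.82)


def dn_dlnm(mass: float, rho_m: float, sigma_star: float = 1.0,
            m_star: float = 1.0e14, gamma: float = 0.25) -> float:
    """Comoving number density per unit ln(M) for a toy sigma(M).

    ASSUMPTION: sigma(M) = sigma_star * (M / m_star)**(-gamma) is a
    hypothetical power law, for which d ln(1/sigma) / d ln M = gamma.
    """
    sigma = sigma_star * (mass / m_star) ** (-gamma)
    return (rho_m / mass) * gamma * multiplicity(sigma)
```

With this toy $\sigma(M)$ the abundance drops steeply with mass, which is the exponential sensitivity the text refers to.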

Eq. (1) shows that the number density of clusters is sensitive to the variance of the density field, and hence to the initial power spectrum. However, uncertainties in the estimation of the mass are degenerate with changes in cosmological parameters. The utility of cluster number counts is therefore limited by uncertainties in the mass-observable relation. Results from both simulations (e.g. [?]) and observations (e.g. [?]) suggest that the mass-observable relations can be parametrized in simple forms with lognormal scatter of the mass-observable about the mean relation. Other works (see e.g. [23]) suggest that the distribution of galaxies in halos may be more complicated.

For $n$ observables, the probability of measuring clusters given the true mass $M$ and redshift $z$ is

$$p(\mathbf{M}^{\rm obs}, z^p \,|\, M, z)\,\phi(\mathbf{M}^{\rm obs}), \tag{2}$$

where $\mathbf{M}^{\rm obs} = (M_1^{\rm obs}, M_2^{\rm obs}, \ldots, M_n^{\rm obs})$ and $\phi(\mathbf{M}^{\rm obs})$ is the combined selection function for all the observables. For
simplicity, we always work in a range of redshift and mass 
where the surveys are expected to be nearly complete. 
This allows us to approximate the selection function as 
unity. This range depends on the observable we are us- 
ing, so we postpone justifying our assumptions for survey 
selections to §III when we describe the different cluster techniques. We further assume that the redshift errors are independent of the mass-observable errors. This assumption is not strictly true, since the bigger the cluster, the more bright optical galaxies it should have, and the better the cluster redshift estimate will be. This is particularly relevant for optical clusters, for which the cluster detection and mass estimate are inseparable from the cluster redshift determination. We will postpone dealing with this difficulty to a later work. For now, we write

$$p(\mathbf{M}^{\rm obs}, z^p \,|\, M, z) = p(\mathbf{M}^{\rm obs}\,|\,M)\,p(z^p\,|\,z). \tag{3}$$



We define the probability of measuring the observable $M^{\rm obs}$ given the true mass $M$ as

$$p(M^{\rm obs}|M) = \frac{1}{\sqrt{2\pi\sigma^2_{\ln M}}}\exp\left[-x^2(M^{\rm obs})\right], \tag{4}$$

where

$$x(M^{\rm obs}) \equiv \frac{\ln M^{\rm obs} - \ln M - \ln M^{\rm bias}(M,z)}{\sqrt{2\sigma^2_{\ln M}(M,z)}}. \tag{5}$$

We describe our parametrization of $M^{\rm bias}(M,z)$ and $\sigma^2_{\ln M}(M,z)$ in §III when we discuss our modeling of different cluster techniques.

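The number density of clusters at a given redshift $z$ with observable in the range $M^{\rm obs}_{\alpha} < M^{\rm obs} < M^{\rm obs}_{\alpha+1}$ is given by

$$\begin{aligned}
\bar n_\alpha(z) &\equiv \int_{M^{\rm obs}_{\alpha}}^{M^{\rm obs}_{\alpha+1}} \frac{dM^{\rm obs}}{M^{\rm obs}} \int \frac{dM}{M}\,\frac{dn}{d\ln M}\,p(M^{\rm obs}|M) \\
&= \int \frac{dM}{M}\,\frac{dn}{d\ln M}\,\frac{1}{2}\left[{\rm erfc}(x_\alpha) - {\rm erfc}(x_{\alpha+1})\right],
\end{aligned} \tag{6}$$

where $x_\alpha = x(M^{\rm obs}_\alpha)$.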
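The integral of the lognormal distribution over an observable bin reduces to a difference of complementary error functions. The following minimal sketch evaluates that bin probability for a single true mass; the function names and default values are ours and purely illustrative.

```python
import math


def x_of(m_obs: float, m_true: float, ln_m_bias: float, sigma_lnm: float) -> float:
    """Scaled residual x = (ln M_obs - ln M - ln M_bias) / sqrt(2 sigma^2)."""
    return (math.log(m_obs) - math.log(m_true) - ln_m_bias) / (math.sqrt(2.0) * sigma_lnm)


def bin_probability(m_lo: float, m_hi: float, m_true: float,
                    ln_m_bias: float = 0.0, sigma_lnm: float = 0.25) -> float:
    """Probability that M_obs falls in [m_lo, m_hi) given the true mass.

    m_hi may be math.inf (open upper bin); m_lo must be positive.
    """
    x_lo = x_of(m_lo, m_true, ln_m_bias, sigma_lnm)
    x_hi = x_of(m_hi, m_true, ln_m_bias, sigma_lnm)
    return 0.5 * (math.erfc(x_lo) - math.erfc(x_hi))
```

For an unbiased observable, a cluster whose true mass sits exactly at the threshold has a probability of one half of being observed above it, and a larger scatter raises the chance that a cluster below the threshold is counted.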

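We define the probability of measuring two observables $M_a^{\rm obs}$, $M_b^{\rm obs}$ given the true mass as a bivariate Gaussian distribution

$$p(M_a^{\rm obs}, M_b^{\rm obs}|M) = \frac{1}{2\pi\,\det(C)^{1/2}}\exp\left[-\frac{1}{2}\,\mathbf{u}^T C^{-1}\mathbf{u}\right], \tag{7}$$

where $\mathbf{u}$ is the vector with components $u_i = \ln M_i^{\rm obs} - \ln M - \ln M_i^{\rm bias}$ ($i = a, b$) and $C$ is the covariance matrix defined as

$$C = \begin{pmatrix} \sigma_a^2 & \rho\,\sigma_a\sigma_b \\ \rho\,\sigma_a\sigma_b & \sigma_b^2 \end{pmatrix}, \tag{8}$$

and $\rho \in [-1, 1]$ is the correlation coefficient. We motivate the use of the bivariate distribution in the Appendix.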

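At a given redshift $z$, the average number density of clusters with observables such that $M^{\rm obs}_{a,\alpha} < M_a^{\rm obs} < M^{\rm obs}_{a,\alpha+1}$ and $M^{\rm obs}_{b,\beta} < M_b^{\rm obs} < M^{\rm obs}_{b,\beta+1}$ is given by

$$\begin{aligned}
\bar n_{\alpha,\beta}(z) &= \int_{M^{\rm obs}_{a,\alpha}}^{M^{\rm obs}_{a,\alpha+1}} \frac{dM_a^{\rm obs}}{M_a^{\rm obs}} \int_{M^{\rm obs}_{b,\beta}}^{M^{\rm obs}_{b,\beta+1}} \frac{dM_b^{\rm obs}}{M_b^{\rm obs}} \int \frac{dM}{M}\,\frac{dn}{d\ln M}\,p(M_a^{\rm obs}, M_b^{\rm obs}|M) \\
&= \int \frac{dM}{M}\,\frac{dn}{d\ln M} \int_{M^{\rm obs}_{a,\alpha}}^{M^{\rm obs}_{a,\alpha+1}} \frac{dM_a^{\rm obs}}{M_a^{\rm obs}}\,\frac{e^{-x_a^2}}{\sqrt{2\pi\sigma_a^2}}\,\frac{1}{2}\left\{{\rm erfc}\left[\frac{\rho x_a - x_b(M^{\rm obs}_{b,\beta+1})}{\sqrt{1-\rho^2}}\right] - {\rm erfc}\left[\frac{\rho x_a - x_b(M^{\rm obs}_{b,\beta})}{\sqrt{1-\rho^2}}\right]\right\},
\end{aligned} \tag{9}$$

where $x_a \equiv x(M_a^{\rm obs})$ and $x_b$ are defined as in Eq. (5), evaluated with the bias and scatter of observables $a$ and $b$, respectively.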
For the two observables case, the integrals over the observables can only be performed analytically if $\rho = 0$. One would think that this problem could be resolved by diagonalizing the inverse of the covariance matrix defined in Eq. (8). Diagonalization, however, does not simplify the calculation because the limits of the innermost integral over observables become dependent on the other observable. Thus, one cannot avoid performing the numerical integration. The equation for $b(z)$ is modified analogously to Eq. (9).
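To make the numerical step concrete, the sketch below integrates the joint probability of one (a, b) bin for a fixed true mass. It works directly in the scaled residuals $x_a$, $x_b$ of Eq. (5), performs the integral over $x_b$ analytically as a difference of complementary error functions, and uses Simpson's rule for the remaining integral over $x_a$. The function names and the choice of Simpson's rule are ours, not necessarily the scheme used for the results in this paper.

```python
import math


def window_b(x_a: float, xb_lo: float, xb_hi: float, rho: float) -> float:
    """Analytic integral over observable b at fixed x_a (requires |rho| < 1)."""
    s = math.sqrt(1.0 - rho * rho)
    return 0.5 * (math.erfc((rho * x_a - xb_hi) / s)
                  - math.erfc((rho * x_a - xb_lo) / s))


def joint_bin_probability(xa_lo: float, xa_hi: float, xb_lo: float, xb_hi: float,
                          rho: float, n: int = 400) -> float:
    """Simpson integration over x_a of exp(-x_a^2)/sqrt(pi) * window_b.

    xa_lo and xa_hi must be finite; xb_lo and xb_hi may be +/- math.inf.
    """
    if n % 2:
        n += 1  # Simpson's rule needs an even number of intervals
    h = (xa_hi - xa_lo) / n
    total = 0.0
    for k in range(n + 1):
        x_a = xa_lo + k * h
        weight = 1.0 if k in (0, n) else (4.0 if k % 2 else 2.0)
        gauss = math.exp(-x_a * x_a) / math.sqrt(math.pi)
        total += weight * gauss * window_b(x_a, xb_lo, xb_hi, rho)
    return total * h / 3.0
```

For $\rho = 0$ the result factorizes into the product of the two single-observable bin probabilities, and for $\rho > 0$ the probability that both observables scatter high exceeds one quarter, as expected for positively correlated errors.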

We interpret Eq. (9) as the combination of the error-free number density multiplied by two window functions defined as:

$$W_a \equiv \frac{1}{\sqrt{2\pi\sigma_a^2}}\exp\left(-x_a^2\right) \tag{10}$$

and

$$W_b \equiv \frac{1}{2}\left\{{\rm erfc}\left[\frac{\rho x_a - x_b(M^{\rm obs}_{b,\beta+1})}{\sqrt{1-\rho^2}}\right] - {\rm erfc}\left[\frac{\rho x_a - x_b(M^{\rm obs}_{b,\beta})}{\sqrt{1-\rho^2}}\right]\right\}. \tag{11}$$

Window $W_a$ has characteristic width given by the scatter of the observable $a$ with respect to the true mass, and is centered, in the $\ln M_a^{\rm obs} - \ln M$ coordinate, at the bias in the mass-observable relation, $\ln M^{\rm bias}$. The shape and position of window $W_b$ in $(\ln M_a^{\rm obs} - \ln M)$ depend on the value of the correlation coefficient $\rho$ as well as on the boundaries of the mass bin of the observable $b$, $M^{\rm obs}_{b,\beta}$ and $M^{\rm obs}_{b,\beta+1}$. If $\rho = 0$, $W_b$ is simply a constant, independent of $M_a^{\rm obs}$ and $M$, as expected. For finite $\rho$, $W_b$ has the shape of a Mexican hat. As $|\rho| \to 1$, $W_b$ approaches a top-hat function, with edges at $x_b(M^{\rm obs}_{b,\beta})$ and $x_b(M^{\rm obs}_{b,\beta+1})$ for positive $\rho$ or at $-x_b(M^{\rm obs}_{b,\beta+1})$ and $-x_b(M^{\rm obs}_{b,\beta})$ for negative $\rho$. $W_b$ is not invariant under $\rho \to -\rho$ transformations. Decreasing $\rho$ "spreads out" the number counts in the $M_a^{\rm obs}$ - $M_b^{\rm obs}$ plane. If the observables have different scatter, the spreading will be asymmetric with respect to the $M_a^{\rm obs} = M_b^{\rm obs}$ line. In other words, variations in $\rho$ are partially degenerate with both the scatter and bias of the different observables.

The mean cluster number counts are given by integrating Eq. (6) (or Eq. (9)) over comoving volume. In spherical comoving coordinates, the volume element $dV$ is

$$dV = r^2\,dr\,d\Omega = \frac{r^2(z)}{H(z)}\,dz\,d\Omega, \tag{12}$$

where $H(z)$ is the Hubble parameter at redshift $z$, $r(z)$ is the comoving angular diameter distance and $d\Omega$ is the differential solid angle. Uncertainties in the redshifts distort the volume element. Assuming photometric techniques are used to determine the redshifts of the clusters, we parametrize the probability of measuring a photometric redshift, $z^p$, given the true cluster redshift $z$ as [17]

$$p(z^p|z) = \frac{1}{\sqrt{2\pi\sigma_z^2}}\exp\left[-y^2(z^p)\right], \tag{13}$$

where

$$y(z^p) \equiv \frac{z^p - z - z^{\rm bias}}{\sqrt{2\sigma_z^2}}, \tag{14}$$

and $z^{\rm bias} = z^{\rm bias}(z)$ is the photometric redshift bias and $\sigma_z^2 = \sigma_z^2(z)$ is the variance in the photo-z's. We parametrize them as

$$z^{\rm bias}(z) \equiv z^{\rm bias} + d_1(1+z), \tag{15}$$

$$\sigma_z(z) = \sigma_z^0 + e_1(1+z). \tag{16}$$

For this paper we set the fiducial values $z^{\rm bias} = d_1 = e_1 = 0$, and $\sigma_z^0 = 0.02$, the expected overall scatter of cluster photo-z's in the Dark Energy Survey [1]. We hold these parameters fixed throughout.

Assuming perfect angular selection the mean number of clusters in a photo-z bin $z_i^p < z^p < z_{i+1}^p$ is

$$\bar m_{\alpha,\beta,i} = \int_{z_i^p}^{z_{i+1}^p} dz^p \int dV\,\bar n_{\alpha,\beta}\,W_i^{\rm th}(\Omega)\,p(z^p|z), \tag{17}$$

where $W_i^{\rm th}(\Omega)$ is an angular top hat window function.

To simplify the notation, henceforth we use the index $\alpha$ to indicate bins of both observables.



B. Noise in counts

The number of clusters found in an angular/redshift bin can deviate from the mean counts because of Poisson noise and large scale structure clustering. Both effects must be included in any likelihood analysis. On cluster scales, the clustering of baryonic matter follows the linear density fluctuations of total matter $\delta(\mathbf{x})$ corrected by the linear bias. That is,

$$m_{\alpha,i}(\mathbf{x}) = \bar m_{\alpha,i}\left[1 + b_{\alpha,i}(z)\,\delta(\mathbf{x})\right], \tag{18}$$

where $b_{\alpha,i}(z)$ is the average cluster linear bias defined as

$$b_{\alpha,i}(z) = \frac{1}{\bar n_{\alpha,i}(z)} \int \frac{dM_a^{\rm obs}}{M_a^{\rm obs}} \int \frac{dM_b^{\rm obs}}{M_b^{\rm obs}} \int \frac{dM}{M}\,\frac{dn}{d\ln M}\,b(M;z)\,p(\mathbf{M}^{\rm obs}|M). \tag{19}$$

We adopt the $b(M;z)$ fit of [24]:

$$b(M;z) = 1 + \frac{a_c\delta_c^2/\sigma^2 - 1}{\delta_c} + \frac{2p_c}{\delta_c\left[1 + (a_c\delta_c^2/\sigma^2)^{p_c}\right]}, \tag{20}$$

with $a_c = 0.75$, $p_c = 0.3$, and $\delta_c = 1.69$.
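For reference, the halo bias fit quoted above can be evaluated directly once $\sigma(M,z)$ is known. The following minimal sketch assumes the standard form $b = 1 + (a_c\delta_c^2/\sigma^2 - 1)/\delta_c + 2p_c/\{\delta_c[1 + (a_c\delta_c^2/\sigma^2)^{p_c}]\}$ with the constants given in the text; the function name `halo_bias` is ours.

```python
import math

A_C = 0.75       # fit parameter a_c
P_C = 0.3        # fit parameter p_c
DELTA_C = 1.69   # linear collapse threshold delta_c


def halo_bias(sigma: float) -> float:
    """Linear halo bias as a function of sigma(M, z)."""
    nu_sq = A_C * DELTA_C ** 2 / sigma ** 2
    return 1.0 + (nu_sq - 1.0) / DELTA_C + 2.0 * P_C / (DELTA_C * (1.0 + nu_sq ** P_C))
```

Rarer objects (smaller $\sigma$) are more strongly biased, which is why the clustering of the most massive bins carries information on the mass scale.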

The sample covariance of counts $m_{\alpha,i}$ is given by

$$S^{\alpha\beta}_{ij} = \left\langle (m_{\alpha,i} - \bar m_{\alpha,i})(m_{\beta,j} - \bar m_{\beta,j}) \right\rangle \tag{21}$$

$$\phantom{S^{\alpha\beta}_{ij}} = b_{\alpha,i}\,\bar m_{\alpha,i}\,b_{\beta,j}\,\bar m_{\beta,j} \int \frac{d^3k}{(2\pi)^3}\,W_i^*(\mathbf{k})\,W_j(\mathbf{k})\,\sqrt{P_i(k)P_j(k)}, \tag{22}$$

where $W_i^*(\mathbf{k})$ is the Fourier transform of the top-hat window function and $P_i(k)$ is the linear power spectrum at the centroid of redshift bin $i$. Notice that, in contrast to [?], we use $\sqrt{P_i(k)P_j(k)}$ instead of $P(k)$ at an average redshift. We do not notice significant differences from this change. In addition, for computational efficiency, we only calculate covariance terms for which $|i - j| < 1$ and set the remaining terms to zero. Going from Eq. (21) to Eq. (22) we assumed that the bias was approximately constant in each photo-z bin so that it could be removed from the integral. We only considered the sample covariance in bins of redshift, but the angular covariance also contains useful information. We postpone calculating the full sample covariance to a future work.

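Following [17], we find that the window function $W_i(\mathbf{k})$ in the presence of photo-z errors is given by

$$W_i(\mathbf{k}) = 2\exp\left[i k_\parallel\left(r_i + \frac{z_i^{\rm bias}}{H_i}\right)\right]\exp\left[-\frac{\sigma_{z,i}^2 k_\parallel^2}{2H_i^2}\right]\frac{\sin(k_\parallel\delta r_i/2)}{k_\parallel\delta r_i/2}\,\frac{J_1(k_\perp r_i\theta_s)}{k_\perp r_i\theta_s}. \tag{23}$$

Here $r_i = r(z_i^p)$ is the angular diameter distance to the $i$th photo-z bin, and $\delta r_i = r(z^p_{i+1}) - r(z^p_i)$. Similarly, $H_i = H(z_i^p)$, $z_i^{\rm bias} = z^{\rm bias}(z_i^p)$, and $\sigma_{z,i} = \sigma_z(z_i^p)$. We assumed that $H(z)$, $z^{\rm bias}(z)$, and $\sigma_z(z)$ are constant inside each bin.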

The Poisson noise of the counts is fully specified by the mean counts $\bar m$. The sample variance in the counts is determined by the mean counts, the bias, and the initial power spectrum. Since all these quantities can be predicted theoretically, both the mean counts and the sample variance contain useful information. In the following section we use the Fisher matrix formalism to estimate joint constraints for dark energy and mass-observable parameters using the information in the counts and the noise.



C. Fisher Matrix

Given a model specified by a set of parameters $p_\alpha$, with likelihood $L$, the Fisher information matrix is defined as

$$F_{\alpha\beta} \equiv -\left\langle \frac{\partial^2 \ln L}{\partial p_\alpha\,\partial p_\beta} \right\rangle. \tag{24}$$

The marginalized errors in the parameters are given by $\sigma(p_\alpha) = [(F^{-1})_{\alpha\alpha}]^{1/2}$. Priors are easily incorporated into the Fisher matrix. If parameter $p_i$ has a prior uncertainty of $\sigma(p_i)$, we simply add $\sigma(p_i)^{-2}$ to the $F_{ii}$ entry of the Fisher matrix before inverting.

Define the covariance matrix

$$C_{ij} = S_{ij} + \bar m_i\,\delta_{ij}, \tag{25}$$

where $\bar m_i$ is the vector of mean counts defined in Eq. (17) and $S_{ij}$ is the sample covariance defined in Eq. (22). The indices $i$ and $j$ here run over all mass and redshift bins. Assuming Poisson noise and sample variance are the only sources of noise, the Fisher matrix is [?, ?, 26]

$$F_{\alpha\beta} = \bar{\mathbf{m}}^T_{,\alpha}\,C^{-1}\,\bar{\mathbf{m}}_{,\beta} + \frac{1}{2}{\rm Tr}\left[C^{-1}S_{,\alpha}\,C^{-1}S_{,\beta}\right], \tag{26}$$

where the "," denote derivatives with respect to the model parameters. The first term on the right-hand side contains the "information" from the mean counts, $\bar m$. The $S_{ij}$ matrix only contributes noise to this term, and hence only reduces its information content. The second term contains the information from the sample covariance.

For our purposes, the model parameters are the cosmological parameters, the parameters describing the errors in the observables (i.e. the mass nuisance parameters), and the parameters of the photo-z errors. We use two sets of fiducial cosmological parameters. One set is based on the first-year data release of the Wilkinson Microwave Anisotropy Probe (WMAP1, [27]) and the other is based on the third-year data release (WMAP3, [28]). We use WMAP1 and WMAP3 instead of the more recent five-year data release because WMAP1 and WMAP3 are more extreme cases with regard to the value of $\sigma_8$ and the predicted number counts, and hence WMAP5 lies more or less in between them. The WMAP1 parameters assumed are: the baryon density, $\Omega_b h^2 = 0.024$, the dark matter density, $\Omega_m h^2 = 0.14$, the normalization of the power spectrum at $k = 0.05\,{\rm Mpc}^{-1}$, $\delta_\zeta = 5.07 \times 10^{-5}$, the tilt, $n = 1.0$, the optical depth to reionization, $\tau = 0.17$, the dark energy density, $\Omega_{\rm DE} = 0.73$, and the dark energy equation of state, $w = -1$. In this cosmology, $\sigma_8 = 0.91$. For WMAP3 we set $\Omega_b h^2 = 0.0223$, $\Omega_m h^2 = 0.128$, $\delta_\zeta = 4.053 \times 10^{-5}$ at $k = 0.05\,{\rm Mpc}^{-1}$, $n = 0.958$, $\tau = 0.093$, $\Omega_{\rm DE} = 0.73$, and $w = -1$. This cosmology corresponds to $\sigma_8 = 0.76$. With the exception of $w$, the cosmological parameters we used have been determined to an accuracy of a few percent. Extrapolating into the future, we assume 1% priors on all cosmological parameters except $\Omega_{\rm DE}$ and $w$. We used CMBfast [29], version 4.5.1, to calculate the transfer functions.



III. CLUSTER MASS DETERMINATION
TECHNIQUES

There are four commonly used cluster detection techniques for which large surveys are planned: optical, X-ray, Sunyaev-Zel'dovich flux decrement, and weak lensing. For our Fisher matrix purposes, each of them is fully specified by a mass threshold, survey area, maximum redshift, and the parameters for the fiducial errors in $M^{\rm obs}$ and $z^p$.

FIG. 1: (Left) Mean counts as a function of redshift $\bar m(z)$ for various mass thresholds, with $\sigma_{\ln M} = 0.25$ for both WMAP1 and WMAP3 cosmologies. (Right) $\bar m(z)$ for various values of $\sigma_{\ln M}$, with $M^{\rm th} = 10^{14.2}\,h^{-1}M_\odot$ assuming a WMAP3 cosmology.

We show the mean number counts per redshift bin per sq. degree as a function of photometric redshift (with a constant scatter of $\sigma_z^0 = 0.02$) for several mass thresholds and scatters in Fig. 1. The left plot shows the mean counts for $M^{\rm th} = 10^{13.5}$, $10^{13.9}$, $10^{14.2}$, and $10^{14.2}\,h^{-1}M_\odot$, for a fixed scatter of $\sigma_{\ln M} = 0.25$. The sensitivity of the counts to the mass threshold is apparent. The plot on the right shows the mean counts for $\sigma_{\ln M} = 0.01$, 0.25, 0.5, 1.0 with the threshold set to $M^{\rm th} = 10^{14.2}\,h^{-1}M_\odot$. The increase of the scatter results in an increase in the total counts because the mass function falls exponentially with mass, so that more low-mass clusters scatter above the threshold than high-mass clusters scatter below it. It also causes flattening of the $\bar m(z)$ curve. The increase in the scatter implies an increase in the variance in counts, but a decrease in the shot noise. For perfectly known scatter, the decrease in shot noise outweighs the increase in variance, implying that more scatter can yield better cosmological constraints. However, it is harder to constrain larger scatter and its evolution, and the assumption of Gaussianity may break down. This issue is particularly relevant for a WMAP3 cosmology, where there are fewer clusters compared to WMAP1.
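The effect of the scatter on the counts above threshold can be reproduced with a toy calculation. The sketch below convolves a steeply falling mass function with the lognormal selection above a threshold. The mass function used here (a power law with an exponential cutoff) is a hypothetical stand-in chosen only for its steepness, not the fitting function of Eq. (1), and the normalization is arbitrary.

```python
import math


def toy_dn_dlnm(mass: float, m_star: float = 1.0e14) -> float:
    """ASSUMPTION: hypothetical mass function, power law with exponential cutoff."""
    return (mass / m_star) ** -1.0 * math.exp(-mass / m_star)


def counts_above_threshold(sigma_lnm: float, log10_m_th: float = 14.2,
                           n_steps: int = 4000) -> float:
    """Trapezoid integral over ln M of dn/dlnM * P(M_obs > M_th | M)."""
    ln_lo, ln_hi = math.log(1.0e12), math.log(1.0e16)
    h = (ln_hi - ln_lo) / n_steps
    ln_th = log10_m_th * math.log(10.0)
    total = 0.0
    for k in range(n_steps + 1):
        ln_m = ln_lo + k * h
        x_th = (ln_th - ln_m) / (math.sqrt(2.0) * sigma_lnm)
        weight = 0.5 if k in (0, n_steps) else 1.0
        total += weight * toy_dn_dlnm(math.exp(ln_m)) * 0.5 * math.erfc(x_th)
    return total * h
```

In this toy model the counts grow monotonically with the scatter, in qualitative agreement with the right panel of Fig. 1.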

Since the focus of this paper is on combining clusters in the same area of the sky, we limit our tests to surveys overlapping the South Pole Telescope (SPT) SZ Cluster survey. We thus set the area of the sky to 4000 square degrees, which we subdivide into 400 bins of 10 sq. degrees each. We assume SPT will be able to observe clusters with $M^{\rm obs} > 10^{14.2}\,h^{-1}M_\odot$ up to a redshift of 2 (see e.g. [?]). We assume that photometric redshifts will be available using DES+VISTA photometry. We parametrize the SZ mass bias and variance as

$$\ln M^{\rm bias}(M,z) = \ln M^{\rm bias} + a_1(1+z) \tag{27}$$

$$\phantom{\ln M^{\rm bias}(M,z)} = \ln M^{\rm bias} + \ln M^{\rm bias}(z), \tag{28}$$

$$\sigma^2_{\ln M}(M,z) = \sigma_0^2 + \sum_{i=1}^{3} b_i z^i \tag{29}$$

$$\phantom{\sigma^2_{\ln M}(M,z)} = \sigma_0^2 + \sigma^2_{\ln M}(z). \tag{30}$$

We set the fiducial mass scatter to $\sigma_0 = 0.25$, and all the other nuisance parameters to zero. In total, we use six nuisance parameters for the scatter and bias in mass ($\ln M^{\rm bias}$, $a_1$, $\sigma_0^2$, $b_i$).

We assume a DES-like optical cluster survey with fiducial mass threshold of $M^{\rm th} = 10^{13.5}\,h^{-1}M_\odot$ and maximum redshift of 1. [?] and [33] were able to detect clusters with mass greater than $10^{13.5}\,h^{-1}M_\odot$ with a high level of purity and completeness using photometric data from the Sloan Digital Sky Survey (SDSS, [?]). The MaxBCG method used by these authors relies on red cluster galaxies occupying a distinct region in color space, the red sequence. The red sequence is known to be present in clusters at least to redshift of 1 (see e.g. [?]), so that we are justified in our choice for the expected DES mass threshold. Our choice of maximum redshift is somewhat conservative since with the addition of the IR filters from the VISTA survey, DES+VISTA will have accurate redshifts (for field galaxies) up to $z \sim 1.5$. Conversely, the maximum redshift of 2 for SPT relies on the expectation that a deeper optical follow-up may be available for SPT-detected clusters. We show in §IV that if the cross-calibration is performed, the SZ clusters above $z \sim 1$ contribute very little to the cosmological constraints.

Different studies suggest a wide range of scatter for optical observables, ranging from a constant $\sigma_{\ln M} = 0.5$ [35] to a mass-dependent scatter in the range $0.75 < \sigma_{\ln M} < 1.2$ [36]. After the submission of this paper, a couple of papers made more optimistic estimates for the scatter. Using weak lensing and X-ray analysis of MaxBCG selected optical clusters, [37] estimated a scatter of ~0.45 between weak lensing and optical richness estimates. In [38] the authors show that improved richness estimators may reduce the optical scatter. As a conservative compromise, we choose a fiducial mass scatter of $\sigma_{\ln M} = 0.5$ and allow for a cubic evolution in redshift and mass:

$$\ln M^{\rm bias}(M,z) = \ln M^{\rm bias} + a_1(1+z) + a_2(\ln M^{\rm obs} - \ln M^{\rm pivot}) \tag{31}$$

$$\phantom{\ln M^{\rm bias}(M,z)} = \ln M^{\rm bias} + \ln M^{\rm bias}(z) + \ln M^{\rm bias}(M),$$

$$\sigma^2_{\ln M}(M,z) = \sigma_0^2 + \sum_{i=1}^{3} b_i z^i + \sum_{i=1}^{3} c_i(\ln M^{\rm obs} - \ln M^{\rm pivot})^i \tag{32}$$

$$\phantom{\sigma^2_{\ln M}(M,z)} = \sigma_0^2 + \sigma^2_{\ln M}(z) + \sigma^2_{\ln M}(M).$$

We set $\ln(M^{\rm pivot}) = 34.5$ (with $M$ in units of $h^{-1}M_\odot$). In all, we have 10 nuisance parameters for the optical mass errors ($\ln M^{\rm bias}$, $a_1$, $a_2$, $\sigma_0^2$, $b_i$, $c_i$). The results we obtain are sensitive to the choice of parametrization, particularly the number of nuisance parameters. There are few, if any, constraints on the number of parameters necessary to realistically describe the evolution of the variance and bias with mass for any technique. If simpler parametrizations than the ones we adopt here should prove to describe the variations in the errors well, then cosmological constraints would improve.
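The nuisance-parameter bookkeeping for the two surveys can be summarized in a small container. This is a minimal sketch under our reading of the parametrization above (a bias linear in $(1+z)$ and in $\ln M^{\rm obs} - \ln M^{\rm pivot}$, and a variance with cubic polynomials in $z$ and in $\ln M^{\rm obs} - \ln M^{\rm pivot}$); the class and attribute names are ours.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class MassObservableErrors:
    """Bias and variance of ln M_obs about ln M (names are illustrative)."""
    ln_m_bias: float = 0.0
    a1: float = 0.0                                   # redshift evolution of the bias
    a2: float = 0.0                                   # mass dependence of the bias
    sigma0_sq: float = 0.25                           # constant part of the variance
    b: Tuple[float, float, float] = (0.0, 0.0, 0.0)   # variance terms in z, z^2, z^3
    c: Tuple[float, float, float] = (0.0, 0.0, 0.0)   # variance terms in mass
    ln_m_pivot: float = 34.5

    def ln_bias(self, m_obs: float, z: float) -> float:
        return (self.ln_m_bias + self.a1 * (1.0 + z)
                + self.a2 * (math.log(m_obs) - self.ln_m_pivot))

    def variance(self, m_obs: float, z: float) -> float:
        dlnm = math.log(m_obs) - self.ln_m_pivot
        z_part = sum(b_i * z ** i for i, b_i in enumerate(self.b, start=1))
        m_part = sum(c_i * dlnm ** i for i, c_i in enumerate(self.c, start=1))
        return self.sigma0_sq + z_part + m_part


# Fiducial models: optical scatter 0.5, SZ scatter 0.25, all else zero.
# For the SZ survey a2 and c are held at zero, leaving six free parameters.
OPTICAL_FIDUCIAL = MassObservableErrors(sigma0_sq=0.5 ** 2)
SZ_FIDUCIAL = MassObservableErrors(sigma0_sq=0.25 ** 2)
```

In a Fisher forecast each attribute that is allowed to vary contributes one row and column to the matrix, which is why the optical survey (ten parameters) is harder to self-calibrate than the SZ survey (six).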



A. Redshift/observables space

To calculate the SZ counts and sample variance, we use mass bins of width $\Delta\log M^{\rm obs} = 0.2$ with the exception of the highest mass bin, which we extend to infinity. We set the width of our redshift bins to $\Delta z^p = 0.1$. These bin sizes imply 5 bins of mass and 20 redshift bins for the SZ clusters. For the fiducial optical parameters, we divide the mass range $10^{13.5} < M^{\rm obs}_{\rm opt} < 10^{14.2}\,h^{-1}M_\odot$ into 5 bins and use the same mass binning as the SZ for $M^{\rm obs}_{\rm opt} > 10^{14.2}\,h^{-1}M_\odot$, with a total of 10 mass bins and 10 redshift bins.
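The binning just described can be written down explicitly. In the sketch below the bin edges are stored as $\log_{10}$ of the observable mass in $h^{-1}M_\odot$; the helper names are ours, and we assume that the five low-mass optical bins are equally spaced in $\log M$.

```python
import math
from typing import List


def sz_mass_edges(log_m_min: float = 14.2, width: float = 0.2,
                  n_bins: int = 5) -> List[float]:
    """log10 mass-bin edges; the highest bin extends to infinity."""
    edges = [log_m_min + k * width for k in range(n_bins)]
    edges.append(math.inf)
    return edges


def optical_mass_edges(log_m_min: float = 13.5, log_m_split: float = 14.2,
                       n_low: int = 5) -> List[float]:
    """n_low equal bins below the SZ threshold, then the SZ binning above it."""
    step = (log_m_split - log_m_min) / n_low
    low = [log_m_min + k * step for k in range(n_low)]
    return low + sz_mass_edges(log_m_split)


def redshift_edges(z_max: float, dz: float = 0.1) -> List[float]:
    """Photo-z bin edges from 0 to z_max in steps of dz."""
    n_bins = round(z_max / dz)
    return [k * dz for k in range(n_bins + 1)]
```

Counting the intervals between consecutive edges recovers the numbers quoted in the text: 5 mass and 20 redshift bins for the SZ survey, and 10 mass and 10 redshift bins for the optical survey.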

If the clusters detected by the optical and SZ surveys 
are in different parts of the sky, then the samples are in- 
dependent. To estimate the joint constraints from both 
surveys one simply applies the single mass-observable 
analysis described in the previous section to each of the 
samples and sums the Fisher matrices. 
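For independent surveys this combination is simple matrix arithmetic. The sketch below sums two Fisher matrices, optionally adds a Gaussian prior to one parameter, and returns the marginalized errors $[(F^{-1})_{\alpha\alpha}]^{1/2}$. The helper names are ours, the pure-Python inversion is meant for small illustrative matrices only, and any input matrices are placeholders rather than forecasts from this paper.

```python
import math
from typing import List

Matrix = List[List[float]]


def invert(matrix: Matrix) -> Matrix:
    """Gauss-Jordan inversion with partial pivoting (small matrices only)."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-300:
            raise ValueError("singular Fisher matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def add_fisher(f1: Matrix, f2: Matrix) -> Matrix:
    """Joint Fisher matrix of two independent experiments."""
    return [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(f1, f2)]


def add_prior(fisher: Matrix, index: int, sigma_prior: float) -> Matrix:
    """Return a copy with sigma_prior**-2 added to the diagonal entry."""
    out = [list(row) for row in fisher]
    out[index][index] += sigma_prior ** -2
    return out


def marginalized_errors(fisher: Matrix) -> List[float]:
    """sigma(p_a) = sqrt((F^-1)_aa) for every parameter."""
    inv = invert(fisher)
    return [math.sqrt(inv[i][i]) for i in range(len(inv))]
```

This is only the independent-survey baseline; the cross-calibration discussed next instead builds a single data vector and covariance for the overlapping samples.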

If the clusters are all in the same part of the sky, then the samples are not independent. In addition, some regions of redshift/observable space contain clusters detected by both methods or only one. Our cross-calibration approach calculates the mean counts and clustering at all bins shown in Fig. 2. From Fig. 2 one can see that the observables parameter space is composed of four parts. One is defined as the set of clusters for which $10^{13.5} < M^{\rm obs}_{\rm opt} < 10^{14.2}\,h^{-1}M_\odot$, $M^{\rm obs}_{\rm SZ} < 10^{14.2}\,h^{-1}M_\odot$, and $0 < z < 1$. Only optical clusters are detected in this region. We divide that interval of mass into 5 equally spaced bins and use $P(M^{\rm obs}_{\rm opt}|M)$ to estimate the counts in that region. The second region is defined as the clusters for which $M^{\rm obs}_{\rm SZ} > 10^{14.2}\,h^{-1}M_\odot$, $M^{\rm obs}_{\rm opt} > 10^{13.5}\,h^{-1}M_\odot$ and $0 < z < 1$. The mass bins are simply the outer product of the optical and SZ vectors of bins of observables in that range. It comprises 5 x 10 mass bins and 10 redshift bins. Here we use $P(M^{\rm obs}_{\rm opt}, M^{\rm obs}_{\rm SZ}|M)$ to estimate the counts. The third region is defined by $M^{\rm obs}_{\rm SZ} > 10^{14.2}\,h^{-1}M_\odot$, $M^{\rm obs}_{\rm opt} < 10^{13.5}\,h^{-1}M_\odot$ and $0 < z < 1$. Because there are almost no clusters detected in this region, we do not include it in our analysis. The fourth region is defined by $M^{\rm obs}_{\rm SZ} > 10^{14.2}\,h^{-1}M_\odot$ and $1 < z < 2$. Since only SZ clusters can be found in this region we estimate the counts using $P(M^{\rm obs}_{\rm SZ}|M)$. The counts from the three regions we use are organized into a single vector of counts, and the corresponding covariance of the data (defined in Eq. (25)) is given by a single matrix.

Fig. 1 hints that our choice of binning results in a large number of bins with mean counts substantially below unity. Such a small number of clusters per bin raises two concerns. The first is that in a real survey one would not be able to accurately estimate the mean of such bins. While this is true, our goal in this paper is to examine how much information is in the counts, which we can only be certain of extracting using a large number of bins. Our choice of binning does not yield overly optimistic results since the shot noise increases as the counts per bin become smaller. The bins with very few objects therefore do not contribute significantly to the Fisher matrix. We tested this using a total of 32 bins instead of 50 (in the region of overlap of the surveys) and found only negligible differences in the resulting dark energy constraints. When performing this analysis on real datasets, one would be advised to adopt a different binning strategy, perhaps using tree-structure algorithms to optimally subdivide the data, or hierarchical Bayesian classification algorithms, especially if more than two observables are used.
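The claim that sparsely populated bins add little information can be checked in the Poisson-only limit, where the Fisher information on a single parameter $p$ is $\sum_i (\partial\bar m_i/\partial p)^2/\bar m_i$. The sketch below deliberately neglects sample variance and uses made-up counts and sensitivities, so it illustrates the scaling rather than reproducing the test described above.

```python
from typing import Sequence


def poisson_fisher(mean_counts: Sequence[float], derivs: Sequence[float]) -> float:
    """Poisson-only Fisher information on one parameter.

    mean_counts[i] is the expected count in bin i and derivs[i] its
    derivative with respect to the parameter. Sample variance is ignored.
    """
    return sum(d * d / m for m, d in zip(mean_counts, derivs) if m > 0.0)


def merge_last_two(values: Sequence[float]) -> list:
    """Rebin by adding the last two entries (counts and derivatives both add)."""
    return list(values[:-2]) + [values[-2] + values[-1]]
```

For example, with hypothetical mean counts (50, 20, 0.4, 0.1) and derivatives (50, 30, 0.8, 0.25), the two sparse bins carry only about two percent of the information, and merging them changes the total by a negligible amount.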

The second concern is that with few objects per bin the Gaussian approximation assumed when we defined Eq. (26) (see [7] for a derivation) is not valid. To test the impact of the Gaussian assumption, we performed the single-observable self-calibration analysis for the SZ survey using 5, 10, and 40 mass bins. The results are virtually identical if 5 or 10 bins are used, but degrade by a few percent for 40 bins. We did not investigate whether the degradation was a result of the breakdown of the Gaussian assumption or simply due to numerical noise. The important point is that excessive binning does not yield unrealistic improvements in the constraints.
FIG. 2: Optical-SZ mass bins in the redshift range (left) $0 < z < 1$ and (right) $1 < z < 2$. The black lines indicate the mass threshold for the SZ and optical surveys. The gray lines show the boundaries of the mass bins. We do not use the SZ-only region marked with the asterisk because there are very few clusters in that region.



IV. RESULTS 

Unless stated otherwise, all results shown assume no 
priors on the nuisance parameters. 

A. Results for a single observable 

First, we present results for a single observable. Figure 3 shows the dependence of the constraints on $\Omega_{\rm DE}$ (left) and $w$ (right) on the maximum redshift of the survey ($z_{\rm max}$). The dashed and solid black lines are for the fiducial optical mass threshold, scatter and bias in WMAP1 and WMAP3 cosmologies, respectively. The dashed and solid gray lines are the corresponding results assuming the fiducial SZ survey. The rate of improvement in the $\Omega_{\rm DE}$ constraints with $z_{\rm max}$ decreases sharply after $z \sim 0.5$ for all cases except the optical results in WMAP3, where the break happens around $z \sim 1$. The constraints on $w$ show a more pronounced redshift dependence for both optical and SZ. In a WMAP3 cosmology, varying $z_{\rm max}$ from 1 to 2 results in $\sigma(w)$ decreasing by a factor of ~2.5 for the optical and ~2.1 for the SZ. The intersection of the dashed lines in both plots, or of the solid lines in the left plot, marks the redshifts below which the optical survey yields tighter constraints than the SZ survey. At this point, Poisson noise in the counts is the dominant component of the error budget. The increase in counts due to the larger scatter of the optical observable compensates for the loss of information due to increased scatter.

Figure 4 shows (left) $\sigma(\Omega_{\rm DE})$ and (right) $\sigma(w)$ versus the mass threshold of the survey in a WMAP3 cosmology. The number of mass bins used in the calculation is different for each $M^{\rm th}$. At the lowest threshold, $M^{\rm th} = 10^{13.2}\,h^{-1}M_\odot$ and there are 16 bins of $M^{\rm obs}$. We increase $M^{\rm th}$ in steps of $\Delta\ln M^{\rm obs} = 0.1$ and decrease the number of mass bins by one at every step up to $M^{\rm th} = 10^{14.7}\,h^{-1}M_\odot$. The solid black and solid gray lines show the marginalized constraints for the fiducial optical and SZ parametrizations. For the dashed black line we assume no mass dependence in the optical mass scatter, i.e. we use the same parametrization as the fiducial SZ survey, except that $\sigma_0 = 0.5$, and the maximum redshift is 1. The fact that the dashed black line drops below the gray line in the left plot is another illustration of the point made in §III of larger scatter resulting in better cosmological constraints, despite the lower redshift range of the optical survey and no priors on the scatter. Allowing for mass dependence of $\ln M^{\rm bias}_{\rm opt}$ and $\sigma_{\rm opt}$ not only degrades $\sigma(\Omega_{\rm DE})$ but also increases the sensitivity of the constraints to $M^{\rm th}$. The constraints on $w$ are much less affected, because of the low maximum redshift of the optical survey.



B. Results for two observables 

Figure 5 shows the 68% confidence regions for $\Omega_{\rm DE}$ and $w$ in (left) WMAP1 and (right) WMAP3 cosmologies assuming no priors on the nuisance parameters and no correlation between the observables (i.e. $\rho = 0$, fixed). Comparing both plots, we see that the low fiducial number of clusters in the WMAP3 cosmology implies weaker cosmological constraints. More interestingly, in a cosmology with fewer clusters the lower mass threshold of the optical technique makes it more constraining than the fiducial SZ even without any priors on the bias or scatter. The marginalized constraints are summarized in Table I.

Performing the cross-calibration using only clusters 
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FIG. 3: Constraints on (left) Que and (right) w versus the maximum redshift of the survey for the fiducial optical and SZ 
surveys in WMAP1 and WMAP3 cosmologies. 




FIG. 4: Constraints on (left) $\Omega_{\rm DE}$ and (right) $w$ versus the mass threshold of the survey in a WMAP3 cosmology. The number
of mass bins used in the calculation is different for each $M^{\rm th}$. At the lowest threshold, $M^{\rm th} = 10^{13.2}\,h^{-1} M_\odot$, 16 bins of
$M^{\rm obs}$ are used. We increase $M^{\rm th}$ in steps of $\Delta \ln M^{\rm obs} = 0.1$ and decrease the number of mass bins by one at every step, up
to $M^{\rm th} = 10^{14.7}\,h^{-1} M_\odot$. The solid black and solid gray lines are the marginalized constraints for the fiducial optical and SZ
parametrizations. For the dashed black line we assume no mass dependence in the optical mass scatter, i.e. it uses exactly the
same parametrization as the fiducial SZ survey, except that $\sigma_0 = 0.5$ and the maximum redshift is 1.



detected by both methods (hereafter partial cross-calibration,
represented in the plots by the filled gray
ellipses) does not yield very good constraints. The partial
cross-calibration is slightly more useful in a WMAP3
cosmology, because there are few clusters above $z = 1$,
so that excluding that region of parameter space does
not cause much degradation. The cross-calibration with
all available clusters (hereafter full cross-calibration,
filled black ellipses) yields much better constraints
than the partial cross-calibration. In fact, constraints
on $\Omega_{\rm DE}$ and $w$ from the full cross-calibration are
a factor of ~ 2 better than constraints derived by simply
adding the Fisher matrices of the optical and SZ techniques
(the solid black line).
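
The factor of ~ 2 quoted above can be checked against the marginalized errors listed in Table I. The short Python sketch below divides the "Optical + SZ" errors by the "Cross-Cal. (Full)" errors; the dictionary layout and the function name are illustrative choices and not part of the analysis itself.

```python
# Marginalized 1-sigma errors copied from Table I,
# stored as (sigma(Omega_DE), sigma(w)) for each cosmology.
table = {
    "WMAP1": {"optical_plus_sz": (0.032, 0.062), "cross_cal_full": (0.021, 0.030)},
    "WMAP3": {"optical_plus_sz": (0.057, 0.092), "cross_cal_full": (0.025, 0.045)},
}

def improvement(cosmology):
    """Ratio of summed-Fisher errors to full cross-calibration errors."""
    summed = table[cosmology]["optical_plus_sz"]
    full = table[cosmology]["cross_cal_full"]
    return tuple(s / f for s, f in zip(summed, full))

for name in table:
    r_ode, r_w = improvement(name)
    print(f"{name}: sigma(Omega_DE) ratio = {r_ode:.2f}, sigma(w) ratio = {r_w:.2f}")
```

With the Table I values the ratios range from about 1.5 (for $\Omega_{\rm DE}$ in WMAP1) to about 2.3 (for $\Omega_{\rm DE}$ in WMAP3), consistent with the approximate factor of 2.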

We demonstrate the importance of clustering to self- and
cross-calibration in a WMAP3 cosmology in Fig. 6.
Comparing the filled light gray ellipse with the solid
black line, we see that clustering information tightens
constraints on both $\Omega_{\rm DE}$ and $w$ significantly if we only
sum the optical and SZ Fisher matrices. But comparing
the filled dark gray ellipse with the filled black ellipse
we see that clustering does not add as much informa- 

FIG. 5: 68% confidence regions in the $\Omega_{\rm DE}$-$w$ plane in (left) WMAP1 and (right) WMAP3 cosmologies. The constraints from
cross-calibration using only clusters detected simultaneously in optical and SZ (i.e. partial cross-calibration, with selection
$M^{\rm obs}_{\rm sz} > 10^{14.2}\,h^{-1} M_\odot$, $M^{\rm obs}_{\rm opt} > 10^{13.5}\,h^{-1} M_\odot$ and $0 < z < 1$) are represented by the filled gray ellipses. The cross-calibration
using all clusters (i.e. full cross-calibration) yields the filled black ellipses. For comparison, the long dashed red lines show
constraints for the fiducial optical survey, and the short dashed blue lines show constraints for the fiducial SZ survey. Treating
the optical and SZ surveys as independent and adding their Fisher matrices yields the solid black lines.

TABLE I: Marginalized constraints on cosmological parameters

                                          WMAP1                                  WMAP3
Survey                            $\sigma(\Omega_{\rm DE})$   $\sigma(w)$      $\sigma(\Omega_{\rm DE})$   $\sigma(w)$
Intersection                      0.058                       0.093            0.070                       0.15
Optical                           0.057                       0.098            0.10                        0.13
SZ                                0.050                       0.11             0.074                       0.21
Optical + SZ                      0.032                       0.062            0.057                       0.092
Cross-Cal. (Full) (c)             0.021                       0.030            0.025                       0.045
Cross-Cal. ($z_{\rm max} < 1.1$) (c)   0.022                  0.032            0.026                       0.047

(c) Fixed $\rho = 0$.

FIG. 6: The filled light gray ellipse shows the constraints from
summing the SZ and optical Fisher matrices without clustering.
The solid black line indicates the corresponding constraints
when clustering is added. The filled dark gray and
filled black ellipses show the full cross-calibration constraints
without and with clustering, respectively.



tion to the full cross-calibration. Constraints on w are 
unchanged, and $\Omega_{\rm DE}$ constraints improve by a factor of
~ 1.7.
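
As a rough illustration of what "summing the optical and SZ Fisher matrices" amounts to, the sketch below adds two toy Fisher matrices on their shared cosmological parameters and reads off marginalized errors from the inverse. All numbers are hypothetical, each survey is given a single nuisance parameter for brevity, and the helper names (`embed`, `marginalized_errors`) are introduced here only for illustration.

```python
import numpy as np

def marginalized_errors(fisher):
    """Return 1-sigma marginalized errors, sqrt(diag(F^-1))."""
    return np.sqrt(np.diag(np.linalg.inv(fisher)))

def embed(fisher, index, size):
    """Place a survey's Fisher matrix into the joint parameter space."""
    full = np.zeros((size, size))
    full[np.ix_(index, index)] = fisher
    return full

# Hypothetical toy Fisher matrices (illustrative numbers only).
# Joint parameter order: (Omega_DE, w, optical nuisance, SZ nuisance).
F_opt = np.array([[400.0, 120.0, 150.0],
                  [120.0, 100.0,  60.0],
                  [150.0,  60.0,  80.0]])   # over (Omega_DE, w, optical nuisance)
F_sz = np.array([[300.0, -80.0, 100.0],
                 [-80.0, 150.0, -40.0],
                 [100.0, -40.0,  90.0]])    # over (Omega_DE, w, SZ nuisance)

# Surveys treated as independent: Fisher matrices add on shared parameters,
# while each survey keeps its own nuisance parameter.
F_sum = embed(F_opt, [0, 1, 2], 4) + embed(F_sz, [0, 1, 3], 4)

sigma_opt = marginalized_errors(F_opt)[:2]
sigma_sz = marginalized_errors(F_sz)[:2]
sigma_sum = marginalized_errors(F_sum)[:2]
print("optical only :", sigma_opt)
print("SZ only      :", sigma_sz)
print("optical + SZ :", sigma_sum)
```

In this simple sum the two surveys only share the cosmological parameters; the cross-calibration discussed in the text goes further by using the joint distribution of both observables for the same clusters.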

Figure 7 shows $\sigma(\Omega_{\rm DE})$ and $\sigma(w)$ for the full
cross-calibration as a function of the optical mass threshold,
$M^{\rm th}_{\rm opt}$, in both WMAP1 and WMAP3 cosmologies with $\rho$
fixed at zero. The dots indicate boundaries of the mass
bins for $M^{\rm obs}_{\rm opt} < 10^{14.2}\,h^{-1} M_\odot$. Above $10^{14.2}$ we use the
same bins as for $M^{\rm obs}_{\rm sz}$. Constraints on $w$ are slightly
less sensitive to $M^{\rm th}_{\rm opt}$ than constraints on $\Omega_{\rm DE}$.
Comparing the slopes of the curves in Figure 7 and Figure 2,
we see that the full cross-calibration constraints are
less sensitive to $M^{\rm th}$ than the self-calibrated constraints
from optical or SZ alone. In Fig. 4, a change in $M^{\rm th}$ from
$10^{13.5}\,h^{-1} M_\odot$ to $10^{14.2}\,h^{-1} M_\odot$ results in a degradation of
$\sigma(w)$ and $\sigma(\Omega_{\rm DE})$ by factors of ~ 4.0 and ~ 3.6, respectively, for
optical only, and of ~ 5.9 and ~ 4.0 for SZ only. With
the full cross-calibration, the degradation factor is only
~ 3.0 for $\sigma(\Omega_{\rm DE})$ and ~ 3.3 for $\sigma(w)$.

The full cross-calibration also reduces the sensitivity 

FIG. 7: $\sigma(\Omega_{\rm DE})$ and $\sigma(w)$ for the full cross-calibration as a
function of the optical mass threshold, $M^{\rm th}_{\rm opt}$, in both WMAP1
and WMAP3 cosmologies with correlation $\rho$ fixed at zero.
The dots indicate boundaries of the mass bins for $M^{\rm obs}_{\rm opt} <
10^{14.2}\,h^{-1} M_\odot$. Above $10^{14.2}$ we use the same bins as for $M^{\rm obs}_{\rm sz}$.

FIG. 8: $\sigma(\Omega_{\rm DE})$ and $\sigma(w)$ for the full cross-calibration as a
function of the maximum redshift of the optical survey, in
WMAP1 and WMAP3 cosmologies with correlation $\rho$ fixed
at zero.



to the maximum redshift range of the surveys. Figure 8
shows $\sigma(\Omega_{\rm DE})$ and $\sigma(w)$ as a function of the maximum
redshift of the optical survey for the full cross-calibration.
Comparing to Figure 3, it is clear that the individual surveys
are much more sensitive to $z_{\rm max}$ than the full
cross-calibration. For example, if $z_{\rm max}$ changes from 1 to 2
in a WMAP3 cosmology, the optical-only and SZ-only
constraints on $w$ improve by factors of ~ 2.2 and ~ 2.0,
respectively. In comparison, the same change in $z_{\rm max}$ for
the optical survey in the full cross-calibration improves $w$
constraints by a factor of only ~ 1.3. Cross-calibration constraints
are even less sensitive to variations in the maximum redshift
of the SZ survey. For a fixed optical $z_{\rm max} = 1$,
reducing the SZ $z_{\rm max}$ from 2 to 1.1 degrades constraints
by only a few percent in both cosmologies. In this scenario,
we find $\sigma(\Omega_{\rm DE}, w) = (0.022, 0.048)$ in a WMAP1
cosmology and $\sigma(\Omega_{\rm DE}, w) = (0.027, 0.073)$ in a WMAP3
cosmology.

All cross-calibration results shown so far assumed a
correlation coefficient $\rho$ fixed at zero. From Eq. (A14) we
see that $\rho = 0$ implies $\sigma_{ab} = \sigma_{\rm opt-sz} = \infty$. Weak lensing
and X-ray mass measurements of optically-selected
clusters suggest that a more realistic guess would be
$\sigma_{\rm opt-sz} \sim 0.3 - 0.7$, from which Eq. (A14) implies that
$0.19 < |\rho| < 0.55$. A value of $\rho > 0.6$ corresponds to
$\sigma_{\rm opt-sz} < 0.19$. To obtain higher correlation values, one
would need $\sigma_{ab}$ to be small compared to $\sigma_a$ and $\sigma_b$.
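
The mapping between the scatter between observables and the correlation coefficient can be sketched numerically. The snippet below reads Eq. (A14) as $\rho = \sigma_a \sigma_b / [(\sigma_a^2 + \sigma_{ab}^2)(\sigma_b^2 + \sigma_{ab}^2)]^{1/2}$ and evaluates it for the two ends of the quoted $\sigma_{\rm opt-sz}$ range. The fiducial scatters used here ($\sigma_{\rm opt} = 0.5$, $\sigma_{\rm sz} = 0.25$) are assumptions made for this illustration and are not restated in this section.

```python
import math

def correlation(sigma_a, sigma_b, sigma_ab):
    """Correlation coefficient, following our reading of Eq. (A14)."""
    return sigma_a * sigma_b / math.sqrt(
        (sigma_a**2 + sigma_ab**2) * (sigma_b**2 + sigma_ab**2))

# Assumed fiducial scatters (illustrative, not taken from this section).
SIGMA_OPT, SIGMA_SZ = 0.5, 0.25

for sigma_opt_sz in (0.7, 0.3):
    rho = correlation(SIGMA_OPT, SIGMA_SZ, sigma_opt_sz)
    print(f"sigma_opt-sz = {sigma_opt_sz}: rho = {rho:.3f}")
```

Under these assumed scatters the two values come out near 0.2 and 0.55, close to the range quoted in the text. The limits behave as described: $\rho \to 0$ as $\sigma_{ab} \to \infty$, and $\rho \to 1$ as $\sigma_{ab} \to 0$.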

Figure 9 shows the dependence of the constraints on
the dark energy and optical mass nuisance parameters on
the correlation coefficient. From the left plot we see that,
for the full cross-calibration analysis, the dark energy
parameters are insensitive to the value of the correlation
for $\rho < 0.6$. The very sharp drop in the uncertainties of
both cosmological and nuisance parameters at higher
correlation is largely due to the optical and SZ surveys
having different fiducial scatters and mass thresholds.
Given $\sigma_{\rm opt}$ and $\sigma_{\rm sz}$, high
values of the correlation imply very low values of $\sigma_{\rm opt-sz}$,
the scatter between observables. High correlation means
that the scatter in the optical is effectively that of the
SZ survey. From the plot we see that at $\rho = 0.8$ the
combination of optical and SZ results yields constraints very
similar to those of a survey with the optical $M^{\rm th}$ but with SZ scatter
(cf. Fig. El).

The constraints on $\rho$ improve as $\rho$ increases, though
comparing constraints for fixed and free $\rho$, we see that
dark energy constraints are fairly insensitive to $\sigma(\rho)$.
This means that the correlation is sufficiently well
determined by the cross-calibration analysis without need
for additional priors.

In the right plot, we see that for the cross-calibration
using only clusters detected by both methods (i.e. the
partial cross-calibration) the constraints are more dependent
on the value of the correlation and on its uncertainty.
The relation between $\rho$ and the optical bias is
most pronounced. As mentioned in the discussion following
Eq. (fTT|), variations in the correlation change the
distribution of number counts in the plane of the two observed masses in
ways that mimic bias and scatter in the observables. In
the full cross-calibration, the relation between $\sigma(\rho)$ and
$\sigma(\ln M^{\rm bias}_{\rm opt})$ is less pronounced because the information
from clusters detected only by optical (or SZ) helps to
break the degeneracy between the correlation and the
bias. Though not shown, the uncertainty in the bias and
scatter of the SZ observable scales very similarly to that
of the corresponding optical nuisance parameters.

In Figure 10 we show $\sigma(\Omega_{\rm DE})$ (left) and $\sigma(w)$ (right)

FIG. 9: 1-$\sigma$ constraints on dark energy and optical-mass nuisance parameters as a function of correlation $\rho$ for (left) the full
cross-calibration and (right) the partial cross-calibration. For the solid lines $\rho$ is a free parameter whereas the dotted lines are
for $\rho$ fixed. Both plots are for a WMAP3 cosmology.



as functions of the prior on the nuisance parameters for 
the full calibration analysis. Throughout we assume that 

$\sigma_{\rm prior} = \sigma_{\rm prior}(\sigma_0^2) = \sigma_{\rm prior}(\ln M^{\rm bias}) = 0.5\,\sigma_{\rm prior}(a_i) = 0.5\,\sigma_{\rm prior}(b_i) = 0.5\,\sigma_{\rm prior}(c_i)$.

We see from the left plot
that constraints on $\Omega_{\rm DE}$ are most sensitive to priors on
the mass bias, especially the optical mass bias. A prior
of $(0.1)^2$ on the optical mass bias, $\ln M^{\rm bias}_{\rm opt}$, improves $\sigma(\Omega_{\rm DE})$ by a factor of ~ 3.
With priors of $(0.1)^2$ on all parameters (multiplied by two
where appropriate) $\sigma(\Omega_{\rm DE})$ improves by approximately
an order of magnitude!

Constraints on $w$ are largely insensitive to priors on
the mass-dependent part of the optical scatter, $\sigma^2_{\rm opt}(M)$,
or on the SZ mass bias parameters. Priors on the optical
mass bias improve constraints by at most 12%. The
constraints are most sensitive to priors on the redshift-dependent
scatter nuisance parameters, particularly the optical
scatter. A prior of $(0.1)^2$ on $\sigma^2_{\rm opt}(M, z)$ and $\sigma^2_{\rm sz}(z)$
decreases $\sigma(w)$ by a factor of ~ 1.3. The full
cross-calibration can constrain the constant parts of both the
SZ and optical scatter, so that priors on them do not improve
$w$ constraints. The full improvement requires priors
of $(0.01)^2$ on all parameters and yields $\sigma(w) = 0.022$.
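
For readers unfamiliar with how priors enter a Fisher forecast, the sketch below shows the standard operation: an independent Gaussian prior of width $\sigma_{\rm prior}$ on a parameter adds $1/\sigma_{\rm prior}^2$ to the corresponding diagonal element of the Fisher matrix. Here "a prior of $(0.1)^2$" is read as a prior variance, i.e. $\sigma_{\rm prior} = 0.1$, which is an assumption about the notation. The toy matrix and the helper `add_priors` are hypothetical and chosen only for illustration.

```python
import numpy as np

def add_priors(fisher, prior_sigmas):
    """Add independent Gaussian priors: F_ii -> F_ii + 1/sigma_i^2.

    An entry of None leaves that parameter without a prior.
    """
    out = np.array(fisher, dtype=float)  # copy, so the input is not modified
    for i, sigma in enumerate(prior_sigmas):
        if sigma is not None:
            out[i, i] += 1.0 / sigma**2
    return out

def marginalized_errors(fisher):
    """Return 1-sigma marginalized errors, sqrt(diag(F^-1))."""
    return np.sqrt(np.diag(np.linalg.inv(fisher)))

# Hypothetical toy Fisher matrix over (Omega_DE, w, ln M_bias).
F = np.array([[400.0, 120.0, 150.0],
              [120.0, 100.0,  60.0],
              [150.0,  60.0,  80.0]])

before = marginalized_errors(F)
after = marginalized_errors(add_priors(F, [None, None, 0.1]))
print("no prior          :", before)
print("prior on ln M_bias:", after)
```

Because the nuisance parameter is degenerate with the cosmological parameters in this toy matrix, tightening it also tightens the marginalized errors on the other two, which is the qualitative behavior described in the text.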



V. CONCLUSIONS AND FUTURE WORK 

We developed a formalism to derive joint cosmologi- 
cal and cluster mass-observable constraints from cluster 
number counts and clustering sample variance of multi- 
ple cluster finding techniques. The improvement we find 
relative to previous works arises from our use of the inter- 
dependence of cluster measurements performed over the 
same patch of sky to cross-calibrate the mass-observable 
relations of the different techniques. When combining an
SPT-like and a DES-like survey, the full cross-calibration
method yields constraints on $\Omega_{\rm DE}$ and
$w$ that are ~ 2 times tighter than those obtained by simply
adding the Fisher matrices of the
individual experiments. Furthermore, constraints from
the full cross-calibration are less sensitive to $M^{\rm th}$ and
$z_{\rm max}$ than the single mass-observable constraints.

The cross-calibration places tight constraints on the
correlation between the observables without the need
for additional priors. Conversely, priors on the mass
variance and bias can significantly improve the dark energy
constraints. Constraints on $\Omega_{\rm DE}$ are most sensitive
to priors on the mass biases, whereas constraints
on $w$ are more sensitive to priors on the redshift-dependent
part of the scatters. Priors on the optical
nuisance parameters are more relevant than priors on SZ
nuisance parameters for both $\Omega_{\rm DE}$ and $w$ constraints.

Our technique can still be improved. Combining more
than two techniques at a time should further improve
constraints. But we can only combine multiple techniques
if we use a more efficient binning strategy that
minimizes the number of mass bins needed to extract the
useful information. It is possible that a more efficient
binning may improve even the two-observable case,
particularly in cosmologies with low $\sigma_8$.

Work still needs to be done before the self-calibration 
or full cross-calibration can be applied to real data. The 
cross-calibration estimates presented here are sensitive to 
the parametrization of the mass errors. Simulations are 
needed to determine what parametrizations are robust to 
theoretical and experimental uncertainties. Our results 
assumed a perfect selection, but selection effects may bias 
the cosmological constraints. [35] have shown that if the
halo selection depends on halo concentration, and if the 
halo bias depends on the assembly history, the sample 

FIG. 10: $\sigma(\Omega_{\rm DE})$ (left) and $\sigma(w)$ (right) versus the prior on the nuisance parameters for the full calibration analysis. For the
cyan lines, priors were applied on the mass-dependent part of $\sigma^2_{\rm opt}$ only. For the solid red lines priors were applied on all
parameters of $\sigma^2_{\rm opt}$. Applying priors to all terms of $\sigma^2_{\rm sz}$ yields the solid green lines. The blue lines were generated using priors
on $\sigma^2_{\rm opt}$ and $\sigma^2_{\rm sz}$. The dashed green lines have priors on $\ln M^{\rm bias}_{\rm sz}$ and the dashed red lines have priors on $\ln M^{\rm bias}_{\rm opt}$. Applying
priors to all nuisance parameters yields the black lines.



variance due to clustering will deviate from that of a ran- 
dom selection of halos with the same mass distribution. 
If the clustering sample variance is modeled incorrectly, 
the self-calibration may bias the recovered dark energy 
parameters. Since the different cluster surveys are ex- 
pected to have selections with different dependence on 
the halo concentration, cross-calibration should mitigate 
selection effects, though we have yet to test this hypothesis.
Finally, we must still account for the relation between
photo-z and mass-observable errors. Regardless of
the simplifications adopted here, we conclude that having 
overlap between surveys is very important to maximize 
the effectiveness of cross-calibration techniques. 

APPENDIX A: THE PROBABILITY
DISTRIBUTION OF MULTIPLE OBSERVABLES

Studies of the cluster mass-observable relation in the
literature (e.g. 0, [H, H^]), using either simulations or
observations, typically estimate $p(M^{\rm obs}|M)$ (by measuring
the scatter of $M^{\rm obs}(M)$) for a single mass-observable
or the relation between two observables, $p(M^{\rm obs}_a|M^{\rm obs}_b)$,
for a given $M$, or equivalently, assuming no evolution in
$M$. Thus, it is useful to express $p(\mathbf{M}^{\rm obs}|M)$ in terms of
combinations of $p(M^{\rm obs}|M)$ and $p(M^{\rm obs}_a|M^{\rm obs}_b)$. This
can be done using the product rule of probability and
Bayes' theorem. For example, for two observables,

$$
\begin{aligned}
p(\mathbf{M}^{\rm obs}|M) &= p(M^{\rm obs}_a, M^{\rm obs}_b|M) \\
&= p(M^{\rm obs}_a|M)\, p(M^{\rm obs}_b|M^{\rm obs}_a, M) \\
&= p(M^{\rm obs}_a|M)\, p(M^{\rm obs}_b|M)\, p(M^{\rm obs}_a|M^{\rm obs}_b).
\end{aligned}
\tag{A1}
$$


For n observables, 

71-1 

p(M obs |Af) = J 



nr=/+i p(Mf s |M° bs ) 

p{M° hs ) n -o 



Jp(M° bs \M) (A2) 



i=l 



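In this paper we focus on combining two observables
at a time. Given mass measurement techniques $a$ and $b$
we adopt the following parametrizations:

$$
p(M^{\rm obs}_o|M) = \frac{1}{\sqrt{2\pi\sigma_o^2}} \exp\left[-\frac{x_o^2(M^{\rm obs}_o)}{2\sigma_o^2}\right],
\tag{A3}
$$

where $o$ is either $a$ or $b$ and

$$
x_o(M^{\rm obs}_o) = \ln M^{\rm obs}_o - \ln M - \ln M^{\rm bias}_o.
\tag{A4}
$$

The definition of $x_o(M^{\rm obs}_o)$ here differs from the definition
of $x(M^{\rm obs})$ in Eq. © by a factor of $\sqrt{2\sigma^2_{\ln M}}$.
Similarly,

$$
p(M^{\rm obs}_a|M^{\rm obs}_b) = \frac{1}{\sqrt{2\pi\sigma_{ab}^2}} \exp\left[-\frac{x_{ab}^2(M^{\rm obs}_a)}{2\sigma_{ab}^2}\right],
\tag{A5}
$$

where

$$
x_{ab}(M^{\rm obs}_a) = \ln M^{\rm obs}_a - \ln M^{\rm bias}_a - \ln M^{\rm obs}_b + \ln M^{\rm bias}_b = x_a - x_b.
\tag{A6}
$$

Combining all the probability distributions above yields

$$
p(M^{\rm obs}_a, M^{\rm obs}_b|M) \propto \exp[A],
\tag{A7}
$$

where

$$
A = -\frac{x_a^2}{2\sigma_a^2} - \frac{x_b^2}{2\sigma_b^2} - \frac{(x_a - x_b)^2}{2\sigma_{ab}^2},
\tag{A8}
$$

and we have simplified the notation by writing $\sigma_x$ to represent
$\sigma_{\ln M_x}$. Rearranging the terms in (A8) we find

$$
A = -\frac{x_a^2}{2}\left(\frac{1}{\sigma_a^2} + \frac{1}{\sigma_{ab}^2}\right)
    - \frac{x_b^2}{2}\left(\frac{1}{\sigma_b^2} + \frac{1}{\sigma_{ab}^2}\right)
    + \frac{x_a x_b}{\sigma_{ab}^2}.
\tag{A9}
$$

If we define the vector $\mathbf{x} = (x_a, x_b)$ and the matrix

$$
\mathbf{B} = \begin{pmatrix}
\dfrac{1}{\sigma_a^2} + \dfrac{1}{\sigma_{ab}^2} & -\dfrac{1}{\sigma_{ab}^2} \\
-\dfrac{1}{\sigma_{ab}^2} & \dfrac{1}{\sigma_b^2} + \dfrac{1}{\sigma_{ab}^2}
\end{pmatrix},
\tag{A10}
$$

we obtain

$$
A = -\frac{1}{2}\left[\mathbf{x}^T \mathbf{B}\, \mathbf{x}\right].
\tag{A11}
$$

With the above form for $A$, it is clear that we can represent
$p(M^{\rm obs}_a, M^{\rm obs}_b|M)$ by a bivariate Gaussian distribution,

$$
p(M^{\rm obs}_a, M^{\rm obs}_b|M) = \frac{1}{2\pi \det(\mathbf{C})^{1/2}} \exp\left[-\frac{1}{2}\mathbf{x}^T \mathbf{C}^{-1} \mathbf{x}\right],
\tag{A12}
$$

where $\mathbf{C}$ is the covariance matrix defined as

$$
\mathbf{C} = \begin{pmatrix}
\sigma_a^2 & \rho\,\sigma_a\sigma_b \\
\rho\,\sigma_a\sigma_b & \sigma_b^2
\end{pmatrix},
\tag{A13}
$$

and $\rho$ is the correlation coefficient defined in terms of $\sigma_a$,
$\sigma_b$, and $\sigma_{ab}$ as

$$
\rho = \frac{\sigma_a \sigma_b}{\left[(\sigma_a^2 + \sigma_{ab}^2)(\sigma_b^2 + \sigma_{ab}^2)\right]^{1/2}}.
\tag{A14}
$$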
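
The algebra above can be checked numerically. The sketch below builds the matrix of Eq. (A10), confirms that the quadratic form of Eq. (A11) reproduces the exponent of Eq. (A8) for an arbitrary test vector, and compares the correlation coefficient obtained by inverting that matrix with the expression in Eq. (A14). The scatter values and the test vector are arbitrary illustrative choices, and the function names are introduced here only for the check.

```python
import numpy as np

def precision_matrix(sigma_a, sigma_b, sigma_ab):
    """Matrix B as written in Eq. (A10)."""
    ia, ib, iab = sigma_a**-2, sigma_b**-2, sigma_ab**-2
    return np.array([[ia + iab, -iab],
                     [-iab, ib + iab]])

def exponent_a8(x_a, x_b, sigma_a, sigma_b, sigma_ab):
    """Exponent A written as in Eq. (A8)."""
    return (-x_a**2 / (2 * sigma_a**2)
            - x_b**2 / (2 * sigma_b**2)
            - (x_a - x_b)**2 / (2 * sigma_ab**2))

def rho_a14(sigma_a, sigma_b, sigma_ab):
    """Correlation coefficient as written in Eq. (A14)."""
    return sigma_a * sigma_b / np.sqrt(
        (sigma_a**2 + sigma_ab**2) * (sigma_b**2 + sigma_ab**2))

# Illustrative scatters (assumed values, chosen only for the check).
sigma_a, sigma_b, sigma_ab = 0.5, 0.25, 0.3
B = precision_matrix(sigma_a, sigma_b, sigma_ab)

# Eq. (A11): A = -(1/2) x^T B x should match Eq. (A8) for any x.
x = np.array([0.2, -0.1])
quad_form = -0.5 * x @ B @ x
direct = exponent_a8(x[0], x[1], sigma_a, sigma_b, sigma_ab)

# Correlation coefficient of the Gaussian whose inverse covariance is B.
cov = np.linalg.inv(B)
rho_from_B = cov[0, 1] / np.sqrt(cov[0, 0] * cov[1, 1])

print("A from (A11):", quad_form, " A from (A8):", direct)
print("rho from B^-1:", rho_from_B, " rho from (A14):", rho_a14(sigma_a, sigma_b, sigma_ab))
```

Both comparisons should agree to floating-point precision for any positive choice of the three scatters.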